I. INTRODUCTION

Science analysis is the process by which observations 
are transformed into scientific insight and understanding. 
Just as even the most artful analysis cannot
compensate for poor data, so even the best instruments
and observational skill cannot compensate for an inability
to adequately analyze the data. LISA is no exception.

Exactly because LISA is a pathfinder for a new scien- 
tific discipline — gravitational wave astronomy — LISA 
data processing and science analysis methodologies are in 
their infancy and require considerable maturation if they 
are to be ready to take advantage of LISA data. Here we 
offer some thoughts, in anticipation of the LISA Science 
Analysis Workshop, on analysis research problems that 
demonstrate the capabilities of different proposed analy- 
sis methodologies and, simultaneously, help to push those 
techniques toward greater maturity. Particular empha- 
sis is placed on formulating questions that can be turned 
into well-posed problems involving tests run on specific 
data sets, which can be shared among different groups 
to enable the comparison of techniques on a well-defined 
platform. 

The questions, from which demonstration problems 
can be posed, are organized by source type. Accompa- 
nying each set of questions is a short discussion meant 
to provide context and motivation for the questions that 
follow. 



II. TECHNOLOGY READINESS LEVELS 

One way to measure the maturity of LISA data processing
and science analysis techniques is to use the NASA
Technology Readiness Level (TRL) metric. TRLs provide a
systematic measurement of the maturity of a particular
technology (hardware or software) relative to mission
goals (Q). Table I describes the NASA TRLs for software.
When LISA science data become available, the software
necessary for data processing and science analysis
related to LISA science requirements should be at least
TRL 7 and preferably TRL 8. When LISA science results are
released the software should be at TRL 8.

We are aware of no LISA analysis methodologies beyond
TRL 2, and the principal goal of the questions posed
here is to point the way toward raising the TRL
of LISA analysis technology. For these questions to be
useful in this regard they must be attuned to the present 
level of analysis sophistication. Thus, the problems de- 
scribed here are focused on demonstrating capability at 
the level of TRL 2 or TRL 3. Later demonstration prob- 
lems will focus on further developing data processing and 
science analysis technologies to higher TRLs. 



III. VERIFICATION BINARIES 

The verification binaries are a unique subset of the 
resolved galactic binaries described in the next section. 
Verification binaries are systems that have been
identified before science operations begin and that are
well characterized through more traditional astronomical
observations. This characterization makes it possible to
accurately predict the strength, polarization, and
propagation direction of the gravitational waves from
the source. LISA's response and function can thus be
verified from its observations of these systems.

The verification sources will be among the first targets
in a search of the LISA data. The results of those
searches will be used to validate expectations for the
software, the instrumental noise, and the hardware
performance of the observatory. As such, these binaries
will play a vital role in characterizing early LISA
performance, and dedicated analyses will need to be
developed to address this special population of sources.
Questions of particular interest include:

• How soon after observations begin can you identify 
a verification binary, ignoring other sources? With 
other sources (binaries, supermassive black holes, 
extreme mass ratio inspirals, etc.)?

• How does knowledge of a verification binary's
parameters change as a function of LISA observing
time? How long must LISA observations last to recover
the verification parameters to the level of accuracy
provided by electromagnetic observations?

TABLE I. NASA Technology Readiness Levels for software.

TRL 1  Basic principles observed and reported. Basic properties of
       algorithms, representations & concepts. Mathematical
       formulations. Mix of basic and applied research.

TRL 2  Technology concept and/or application formulated. Basic
       principles coded. Experiments with synthetic data. Mostly
       applied research.

TRL 3  Analytical and experimental critical function and/or
       characteristic proof-of-concept. Limited functionality
       implementations. Experiments with small representative data
       sets. Scientific feasibility fully demonstrated.

TRL 4  Module and/or subsystem validation in laboratory environment.
       Standalone prototype implementations. Experiments with
       full-scale problems or data sets.

TRL 5  Module and/or subsystem validation in relevant environment.
       Prototype implementations conform to target
       environment/interfaces. Experiments with realistic problems.
       Simulated interfaces to existing systems.

TRL 6  System/subsystem prototype demonstration in a relevant
       end-to-end environment. Prototype implementations of the
       software on full-scale realistic problems. Partially
       integrated with existing hardware/software systems. Limited
       documentation available. Engineering feasibility fully
       demonstrated.

TRL 7  System prototype demonstration in high-fidelity environment
       (parallel or shadow mode operation). Most of the software's
       functionality available for demonstration and test. Well
       integrated with operational hardware/software systems. Most
       software bugs removed. Limited documentation available.

TRL 8  Actual system completed and "mission qualified" through test
       and demonstration in an operational environment. Thoroughly
       debugged software. Fully integrated with operational hardware
       and software systems. Most user documentation, training
       documentation, and maintenance documentation completed. All
       functionality tested in simulated and operational scenarios.
       Validation & Verification completed.

TRL 9  Actual system "mission proven" through successful mission
       operations. Thoroughly debugged software. Fully integrated
       with operational hardware/software systems. All documentation
       has been completed and users have successful operational
       experience. Sustaining software-engineering support in place.
       Actual system fully demonstrated.

Early studies on LISA observations of verification bi- 
naries have started Q). Prospective LISA verification 
binaries have been identified and a database of the cur- 
rent known parameters for these binaries is being main- 
tained I2I) for use by the LISA community. 
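
As a rough illustration of the observing-time questions posed in this
section, the sketch below uses the standard matched-filter result for a
constant-amplitude sinusoid in stationary noise, for which the optimal
signal-to-noise ratio grows as the square root of the observing time.
The amplitude, noise level, and threshold are hypothetical placeholders,
and the detector response, orbital modulation, and all other sources are
ignored.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600.0


def matched_filter_snr(amplitude, noise_psd, t_obs):
    """Optimal SNR of a constant-amplitude sinusoid observed for t_obs seconds.

    noise_psd is the one-sided noise power spectral density (1/Hz) at the
    signal frequency, assumed constant over the narrow signal bandwidth.
    """
    return amplitude * math.sqrt(t_obs / noise_psd)


def time_to_reach_snr(amplitude, noise_psd, snr_threshold):
    """Observing time (s) needed for the sinusoid to reach snr_threshold."""
    return snr_threshold ** 2 * noise_psd / amplitude ** 2


if __name__ == "__main__":
    # Hypothetical values, chosen only for illustration.
    amplitude = 1.0e-22
    noise_psd = 1.0e-40
    for years in (0.1, 0.5, 1.0):
        snr = matched_filter_snr(amplitude, noise_psd, years * SECONDS_PER_YEAR)
        print(f"T = {years:4.1f} yr   SNR = {snr:6.1f}")
    days = time_to_reach_snr(amplitude, noise_psd, 10.0) / 86400.0
    print(f"SNR of 10 reached after {days:.1f} days")
```

A study of a real verification binary would replace the constant
amplitude with the modulated LISA response and would add the other
sources named in the first question.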



IV. GALACTIC BINARIES 

Stellar mass galactic binary systems are the most abun- 
dant of the sources LISA is capable of observing. Crude 
estimates place the number of binaries that LISA can re- 
solve as distinct sources in the tens of thousands (0; 0) , 
with millions more forming an unresolvable background 
at lower frequencies. The large population of resolvable 
binaries provides opportunities to develop a more com- 
plete map of the galaxy, study the mass distribution of 
binary components, and study the population and evo- 
lution of mass transfer systems. The unresolvable bi- 
nary background provides additional information about 
the number of binaries and their galactic distribution.
Finally, because the signal from binaries is ever-present,
signals from other sources must be identified and
characterized in the forest of resolvable binaries and the
fog of the confusion background. The ability to identify
and characterize isolated binaries and the confusion
background is thus a crucial first challenge for LISA
science analysis.

A. Isolated binaries 

In addition to the verification binaries described in
§ III, there will be several thousand resolvable binaries
that will be unknown and uncharacterized before LISA
begins observations. The ability to identify,
characterize, and extract science from observations of
these binaries will depend largely on the analysis
technique used. Specific questions of interest for an
individual binary analysis algorithm are:

• Given a realistic galactic model, how many indi- 
vidual binary sources can be resolved? How accu- 
rately can resolved binaries be characterized? How 
does the characterization change as observing time 
increases? Does the method mistakenly identify 
"false binaries"? 

• How accurately can the different binary parameters
(sin ι, amplitude, etc.) be determined as a function of
SNR, sky location and observing time?

• Given a particular analysis technique, what is
LISA's "resolving power"? How well can the technique
spatially resolve individual binaries on the
sky? How well can individual binaries be resolved 
in frequency? 

A number of analysis techniques targeting isolated bi- 
naries have appeared in the literature 10; These 
techniques have explored a variety of approaches with 
regard to identification and parameterization of binaries; 
they have yet to be compared and contrasted directly. 
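
For the frequency-resolution question above, a common rule of thumb
(an assumption here, not a result of this paper) is that Fourier
methods separate two monochromatic signals only when their frequencies
differ by more than about 1/T, where T is the observing time. The toy
sketch below counts spectral peaks for two equal-amplitude sinusoids
observed for a short and a long stretch. The function name, the
half-maximum peak criterion, and the chosen frequencies are
illustrative, and noise and the LISA response are omitted.

```python
import numpy as np


def count_spectral_peaks(f1, f2, n_samples, dt=1.0):
    """Count distinct peaks in the spectrum of two equal-amplitude sinusoids.

    A peak is a local maximum of the amplitude spectrum that exceeds half
    of the largest spectral value.
    """
    t = np.arange(n_samples) * dt
    x = np.cos(2.0 * np.pi * f1 * t) + np.cos(2.0 * np.pi * f2 * t)
    spectrum = np.abs(np.fft.rfft(x))
    threshold = 0.5 * spectrum.max()
    inner = spectrum[1:-1]
    is_peak = (inner > spectrum[:-2]) & (inner > spectrum[2:]) & (inner > threshold)
    return int(np.count_nonzero(is_peak))


if __name__ == "__main__":
    n_long = 4096
    f1, f2 = 100.0 / n_long, 104.0 / n_long  # separated by 4 / T_long
    for n in (n_long // 8, n_long):
        separation_in_bins = (f2 - f1) * n * 1.0  # frequency gap times T
        peaks = count_spectral_peaks(f1, f2, n)
        print(f"samples = {n:5d}  gap * T = {separation_in_bins:.1f}  peaks = {peaks}")
```

A technique that fits coherent signal models can in principle do better
than this periodogram picture, which is one reason the question is
posed per analysis technique.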

B. Confusion background 

Below some frequency every analysis technique targeting
individual binary sources will break down as overlapping
signals from the millions of short-period binaries in
the galaxy merge to form a confusion-limited background.

The confusion-limited background is both a boon and
a bane. The background amplitude, shape, and angular
distribution depend on the astrophysics of binary
evolution, the total number of binaries contributing to
the confusion, and the shape of the galaxy. By measuring
this background amplitude, spectrum and angular
distribution on the sky we are measuring these
characteristics of our galaxy. On the other hand, the
confusion-limited background is an astrophysical source
of noise that limits our ability to identify other
sources at low frequencies. Understanding the onset of
confusion will play an important role in understanding
the low-frequency science that is possible with LISA
observations. Interesting questions that can be posed of
techniques targeting the confusion limit include:

• How well can the spectrum (shape and level) of 
the confusion noise be determined as a function of 
frequency and the confusion spectral density? 

• How well can the spatial distribution of the confu- 
sion be determined? 

• How does the characterization of the confusion 
spectrum evolve with increased observing time? 
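
One simple way to frame the last question is to treat the confusion
background as stationary Gaussian noise (an idealizing assumption) and
estimate its spectrum by averaging periodograms. The scatter of such an
estimate then falls roughly as the inverse square root of the number of
averaged segments. The sketch below is a minimal NumPy illustration
with white noise standing in for the confusion background. The function
names are hypothetical and no windowing or overlap is used.

```python
import numpy as np


def averaged_periodogram(x, sample_rate, n_segments):
    """One-sided PSD estimate from non-overlapping, unwindowed periodograms."""
    seg_len = len(x) // n_segments
    segments = np.reshape(x[: seg_len * n_segments], (n_segments, seg_len))
    power = np.abs(np.fft.rfft(segments, axis=1)) ** 2
    psd = 2.0 * power.mean(axis=0) / (sample_rate * seg_len)
    freqs = np.fft.rfftfreq(seg_len, d=1.0 / sample_rate)
    # Drop the first and last bins (zero frequency and, for even segment
    # length, Nyquist), where the one-sided factor of two does not apply.
    return freqs[1:-1], psd[1:-1]


def fractional_scatter(psd):
    """Standard deviation of the estimate across frequency, relative to its mean."""
    return float(np.std(psd) / np.mean(psd))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    noise = rng.normal(0.0, 1.0, size=2 ** 16)  # unit variance, 1 Hz sampling
    for n_segments in (4, 16, 64):
        _, psd = averaged_periodogram(noise, 1.0, n_segments)
        print(n_segments, round(float(np.mean(psd)), 3), round(fractional_scatter(psd), 3))
```

More segments reduce the scatter at the cost of coarser frequency
resolution, so a longer observation is what allows both to improve
together.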

A great deal of astrophysical analysis has gone into 
predicting the possible populations that will contribute 
to the confusion-limited background. A variety of
techniques have been considered to begin to approach the
question of how LISA will view the background (0; E; 



C. From isolated to confused 

The number density of galactic binaries increases
rapidly with decreasing frequency; thus, at high
frequencies we have isolated binaries while at low
frequencies the binaries are unresolvable and we will not
be able to identify the signal from a single binary. How
the fraction of resolvable binaries decreases with
decreasing frequency directly affects our ability to
observe sources that may be situated in the transition
band.

• Given a realistic model of the galactic binary distri- 
bution, how does confusion "emerge" as a function 
of frequency (binary period)? 

• How does the "fog" of confusion "lift" as LISA ob- 
servations progress? 

• There will always be exceptionally bright sources,
which stand out above the confusion. How does
the number of such exceptional binaries vary with 
frequency? 
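
A back-of-the-envelope way to see how confusion "emerges" and "lifts"
is to count sources per frequency bin. The sketch below assumes a
steady-state population driven by gravitational radiation, for which
the number of binaries per unit frequency scales as f^(-11/3), and a
frequency bin of width 1/T. The normalization is a hypothetical value
chosen only to place the transition near a few millihertz after a year.
It is not taken from any of the galactic realizations mentioned in this
paper.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600.0

# Hypothetical normalization of dN/df = NORM * f**(-11/3), in Hz**(8/3).
NORM = 0.018


def sources_per_bin(freq, t_obs, norm=NORM):
    """Expected number of binaries in one frequency bin of width 1 / t_obs."""
    return norm * freq ** (-11.0 / 3.0) / t_obs


def confusion_frequency(t_obs, norm=NORM, max_sources_per_bin=1.0):
    """Frequency below which bins hold more than max_sources_per_bin binaries."""
    return (norm / (t_obs * max_sources_per_bin)) ** (3.0 / 11.0)


if __name__ == "__main__":
    for years in (0.25, 1.0, 3.0):
        f_c = confusion_frequency(years * SECONDS_PER_YEAR)
        print(f"T = {years:4.2f} yr   transition near {1.0e3 * f_c:.2f} mHz")
```

Under these assumptions the transition frequency moves down only as
T^(-3/11), which is one way to quantify how slowly the fog lifts. A
realistic study would use a galactic model and an actual resolvability
criterion for the analysis technique under test.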

An important element in research studies that target 
problems relating to the galactic binaries is the availabil- 
ity of galactic realizations. Several different realizations 
exist, such as those built from binary distribution func- 
tions JstlT^). and those derived from population synthesis 
models OMIH- 

V. BURST SIGNALS OF ASTROPHYSICAL ORIGIN 

LISA can be expected to observe bursts of gravita- 
tional waves from relativistic fly-bys of compact objects 
about supermassive black holes fl^ . More speculative is 
the radiation from the disruption of a main sequence or 
white dwarf via a too-close encounter with an intermedi- 
ate mass black hole. Still more speculative is radiation 
from topological defects in cosmic strings ifisl IT^ . 
Specific questions of interest for analysis methods that
target burst gravitational wave sources include:

• How well are individual bursts resolved in the LISA 
data as a function of signal-to-noise and burst du- 
ration? 

• Is it possible to distinguish a noise burst in the mea- 
surement or sensing functions of the constellation 
from a burst arising from an astrophysical source? 

• Can burst sources of radiation be characterized well 
enough that they can be distinguished by source or 
source type? 

VI. EXTREME MASS RATIO INSPIRALS 

When studying spacetimes, it is natural to discuss the 
motion of a test particle in the background spacetime of 
interest. Nature has been kind enough to provide sys- 
tems that strongly approximate the test body case in the 
extreme mass ratio inspirals (EMRIs): the capture of a 
stellar mass compact secondary object by an intermedi- 
ate or supermassive black hole. With each orbit gravita- 
tional waves carry away energy and angular momentum 
and, at least while the rate of loss of energy and
angular momentum is small, the secondary can be thought
to evolve along a trajectory of geodesics. By studying
this evolution it may be possible to reconstruct a broad
family of geodesics and thus "map" the spacetime in the
neighborhood of a black hole (18).

EMRI radiation is not necessarily continuously observ- 
able in the LISA band. When the orbits are relativistic 
the radiation is beamed, leading to large amplitude vari- 
ations as the beam follows the secondary in its orbit. 
Additionally, many EMRIs may be in high eccentricity 
orbits, in which case the radiation may only be in the 
LISA band during a small fraction of the orbit. 

Besides being natural laboratories for conducting tests
of general relativity, the event rate and characteristics
of EMRIs can lead to insights into the structure and
evolution of galactic centers. EMRIs allow high precision
estimates for the central black hole's mass and spin. The
event rate alone gives an indication of the stellar
density in the cores of galaxies.

The apparent difficulty associated with detecting EMRIs
is that each system is parameterized by up to fourteen
parameters. The high dimensionality of the parameter
space hinders the blunt use of standard template matching
techniques. Consequently, alternative approaches to the
EMRI detection and characterization problems are
required. Early analysis methods have included
semi-coherent searches (20) and the use of time-frequency
methods (21; 22). These approaches are promising, but are
still in the early formulation stages.
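
The dimensionality argument can be made concrete with a crude count. If
a template bank were laid out as a uniform grid with the same number of
points along every parameter (a deliberately naive assumption; real
banks are placed using a metric on the parameter space), the number of
templates grows exponentially with the number of parameters. The grid
density and filtering rate below are hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600.0


def template_count(points_per_parameter, n_parameters):
    """Number of templates on a uniform grid over the parameter space."""
    return points_per_parameter ** n_parameters


def search_cost_years(n_templates, templates_per_second):
    """Wall-clock years needed to filter the data against every template."""
    return n_templates / templates_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    # Hypothetical: 10 grid points per parameter, one million templates per second.
    for n_parameters in (3, 7, 14):
        n_templates = template_count(10, n_parameters)
        years = search_cost_years(n_templates, 1.0e6)
        print(f"{n_parameters:2d} parameters: {n_templates:.1e} templates, {years:.1e} years")
```

Even this coarse grid is out of reach at fourteen parameters, which is
the motivation for the semi-coherent and time-frequency approaches
mentioned above.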

Central issues in EMRI data analysis are: 

• For EMRIs that lead to periodic bursts of radiation 
in the LISA band (owing either to orbital eccentric- 
ity or beaming) can multiple bursts from a single 
EMRI system be linked with each other? 

• What features of an EMRI signal (e.g., location,
black hole spin, secondary mass) become "in
focus" with increased waveform model complexity, 
signal-to-noise ratio, and/or observation duration? 

• How well can a complete EMRI signal be identi- 
fied and characterized in the presence of instru- 
ment noise? A confusion-limited background? A 
confusion-limited background of EMRIs? 



VII. MASSIVE BLACK HOLE BINARIES 

Observing the inspiral, coalescence and ringdown of 
massive black hole binaries will provide critical clues to 
the order in which the large scale structure in the Uni- 
verse evolved: did stars evolve and then galaxies, or 
galaxies and then stars? Did supermassive black holes 
form hierarchically from run-away collision of lower-mass 
black holes, or were they massive at birth, forming from 
the collapse of primordial clouds of gas? LISA can help 
answer these questions by producing a census of merger
event masses and luminosity distances. To obtain
luminosity distances it will be necessary to have
accurate sky positions. For gravitationally "bright"
sources this may come from the gravitational wave
observations themselves (23); however, for dimmer sources
the gravitational wave estimates of position may be too
crude for
an accurate distance determination, in which case the ob- 
servation of an optical counterpart (i.e., the galaxy host 
of the merger) will be necessary to get an accurate red- 
shift (I2I). 

While the inspiral, coalescence and ringdown of a
supermassive black hole binary will always be detected in
the presence of the galactic binaries, if we cannot
identify and characterize an MBH binary source all by
itself we will never be able to identify and characterize
an MBH binary in the presence of the galactic binary
forest and confusion background. Therefore, each of the
following three questions should be answered at three
levels: (1) in the absence of a galactic binary confusion
background, (2) in the presence of an artificially
"cleaned" background with all bright sources removed, and
(3) in the presence of a full galactic binary background:

• How well can an SMBH binary be identified and 
characterized? 

• How "bright" must an MBH binary be to be identified?
How does the accuracy of the MBH characterization scale
with "brightness"?

• How well, as a function of observation time, can you 
determine where and when the binary will coalesce? 
(i.e., what precision a month from coalescence? a 
week? a day?) 
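
To connect the "month, week, day" wording of the last question to
places in the LISA band, the sketch below uses the leading-order
(quadrupole) inspiral relation for a circular binary,
tau = (5/256) (G Mc / c^3)^(-5/3) (pi f)^(-8/3), where Mc is the chirp
mass and f the gravitational wave frequency. Higher-order corrections,
spin, eccentricity, and redshift are ignored, and the example masses
are hypothetical.

```python
import math

G_MSUN_OVER_C3 = 4.925491e-6  # G * M_sun / c**3, in seconds


def chirp_mass(m1, m2):
    """Chirp mass, in the same units as the component masses."""
    return (m1 * m2) ** 0.6 / (m1 + m2) ** 0.2


def time_to_coalescence(chirp_mass_msun, f_gw):
    """Leading-order time (s) to coalescence from gravitational wave frequency f_gw (Hz)."""
    mc_seconds = chirp_mass_msun * G_MSUN_OVER_C3
    return (5.0 / 256.0) * mc_seconds ** (-5.0 / 3.0) * (math.pi * f_gw) ** (-8.0 / 3.0)


def frequency_before_coalescence(chirp_mass_msun, tau):
    """Gravitational wave frequency (Hz) at a time tau (s) before coalescence."""
    mc_seconds = chirp_mass_msun * G_MSUN_OVER_C3
    return (5.0 / (256.0 * tau)) ** 0.375 * mc_seconds ** (-0.625) / math.pi


if __name__ == "__main__":
    mc = chirp_mass(1.0e6, 1.0e6)  # hypothetical equal-mass binary, solar masses
    for label, tau in (("month", 30 * 86400.0), ("week", 7 * 86400.0), ("day", 86400.0)):
        f_gw = frequency_before_coalescence(mc, tau)
        print(f"one {label} before coalescence: f = {1.0e3 * f_gw:.3f} mHz")
```

The precision with which the coalescence time and sky position can
actually be predicted depends on the signal-to-noise ratio accumulated
up to each epoch, which is what the question asks analysis methods to
quantify.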

VIII. MULTI-SOURCE CHALLENGES 

The identification of every LISA source will take 
place in the simultaneous presence, in the LISA data, 
of millions of long-period galactic binaries, myriads of 
distinctly resolvable short-period galactic binaries, and 
multiple extreme-mass-ratio inspirals and supermassive 
black hole inspirals. A critical challenge for LISA
analysis is the ability to identify and characterize
these sources in each other's presence. Central questions
in multi-source analysis include:

• How well can different source types in the data be 
searched for sequentially? For example, can SMBH 
binaries be found and subtracted out of the data 
before galactic binaries or EMRIs are searched for? 

• How well can different source types in the data be
searched for simultaneously?

• What fidelity is required of theoretical source mod- 
els for a given multi-source science analysis proce- 
dure to work? How does the effectiveness of the 
analysis method scale with source simulation fi- 
delity? 
• How can source catalogs of known sources in the
LISA data be used to best effect in multi-source 
analysis? 

• Can source catalogs be created from electromag- 
netic observations in advance of LISA? Can source 
catalogs be created directly from the LISA data? 
If they can, how do they change and evolve with 
extended LISA observing periods? 

• How will unmodeled sources be handled by multi- 
source search and characterization procedures? 
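
The sequential-search question can be explored first on a toy problem.
The sketch below stands in for real sources with two sinusoids of known
frequency in white noise: the louder one is fitted by linear least
squares and subtracted, and the weaker one is then fitted to the
residual. All names and numbers are hypothetical, and the two
frequencies are chosen to be orthogonal over the data span, which is
the easy case. Strongly overlapping sources would bias a sequential fit
and are where a simultaneous fit becomes interesting.

```python
import numpy as np


def fit_and_subtract(t, data, freq):
    """Least-squares fit of a sinusoid at a known frequency; returns (amplitude, residual)."""
    phase = 2.0 * np.pi * freq * t
    design = np.column_stack((np.cos(phase), np.sin(phase)))
    coeffs, _, _, _ = np.linalg.lstsq(design, data, rcond=None)
    amplitude = float(np.hypot(coeffs[0], coeffs[1]))
    return amplitude, data - design @ coeffs


def sequential_search(t, data, freqs):
    """Fit and subtract sources one at a time, in the order given."""
    residual = np.array(data, dtype=float)
    amplitudes = []
    for freq in freqs:
        amplitude, residual = fit_and_subtract(t, residual, freq)
        amplitudes.append(amplitude)
    return amplitudes, residual


def simulate(seed=0):
    """Toy data: a loud and a weak sinusoid plus unit-variance white noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(4000.0)
    loud = 10.0 * np.cos(2.0 * np.pi * 0.0100 * t + 0.3)
    weak = 0.5 * np.cos(2.0 * np.pi * 0.0125 * t + 1.1)
    return t, loud + weak + rng.normal(0.0, 1.0, size=t.size)


if __name__ == "__main__":
    t, data = simulate()
    amplitudes, residual = sequential_search(t, data, [0.0100, 0.0125])
    print("recovered amplitudes:", [round(a, 3) for a in amplitudes])
    print("residual rms:", round(float(residual.std()), 3))
```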

In many ways, multi-source analysis synthesizes all the 
challenges related to given source types into a single prob- 
lem. This synthesis represents the inexorable march to- 
ward more realistic simulations of what actual LISA sci- 
ence analysis will look like. 

Several groups have started to make forays into analysis
of data segments with strongly overlapping sources,
using a variety of modern algorithms (26; 27; 28; 29).



IX. DATA SETS FOR SCIENCE ANALYSIS CHALLENGES 

Science analysis demonstrations and feasibility studies
require simulated data that are well characterized and of
sufficient fidelity that the demonstration is meaningful.
Trade studies, evaluations, and qualification of
different technologies are best performed under identical
conditions, so there is great value in archiving and
sharing the data sets used for different studies:
different analysis methods can then be characterized
under the same conditions and their results compared.
Shared data sets have an additional advantage. Comparing
studies carried out on the same data with different
techniques provides practice for the day when real LISA
data are available, when there will be only one LISA data
set and all studies will take place on the same data.

Every demonstration or feasibility study has a goal
that determines the appropriate degree of fidelity (in
noise characteristics, LISA simulation approximations,
etc.) that the simulated data set must satisfy. The
fidelity of the data used in a study should not
substantially exceed that required for a meaningful
demonstration, in order to avoid complications in the
study's interpretation. So, for instance, data sets
designed to probe the ability of an analysis technique to
resolve pairs of binary star systems need not be of full
bandwidth. To be shareable, the data sets should also be
complete and fully documented. Completeness, in this
case, means that the data set should contain everything
necessary to carry through the analysis: the analyst
should not have to make assumptions about, e.g., the
approximations made in the simulation (rigid adiabatic
LISA? second order eccentricity orbits? constellation
position and phase at the initial epoch?) or in the
constellation response (what are the observables?
low-frequency approximate response, or exact response?).

Data sets that can be used as a common platform for
addressing these challenges are currently being
developed, produced, and distributed by two groups. The
Testbed for LISA Analysis (TLA) Project, spearheaded by
the Center for Gravitational Wave Physics, has developed
a data container (the Simulated LISA Data Product, or
SLDP) designed to meet the goal of completeness as
described above. The Mock LISA Data Challenges (MLDC)
group, organized by the LISA International Science Team
Working Group 1B, has developed the LISAxml data
container, which is complete in a different sense:
LISAxml files include a full description of the source
content of the data they contain. Both groups, which
share many members, provide software for reading and
writing data sets in these two formats; additionally, the
TLA Project will provide SLDP versions of the simulated
data content of LISAxml files provided by the MLDC
effort.

Data sets suitable for addressing several of the sci- 
ence analysis issues presented in this paper, and in 
the recommendations that emerge from the LISA Sci- 
ence Analysis Workshop, will be made available as 
SLDP files through the Testbed for LISA Analysis 
web site <http://tla.gravity.psu.edu>. The TLA 
Project invites the participation of scientists in all
aspects of its work, from developing software to support
collaborative work in LISA science analysis, to
generating and providing sample data sets for analysis
studies, to contributing to an annotated bibliography of
analysis study results, and many things in between. For
more information on how to become involved in the TLA
Project visit the TLA website at
<http://tla.gravity.psu.edu/getinvolved/>.

The MLDC effort has developed a systematic series of
"challenges", which are available through their
collaborative working wiki hosted at Caltech,
<http://www.tapir.caltech.edu/dokuwiki/> (click on LISA
Science Team Working Group 1B). The MLDC Group will
provide data sets suitable for addressing these
challenges as LISAxml files through Astrogravs at
<http://astrogravs.nasa.gov/>. People interested in
participating in the MLDC effort should visit their
working wiki for contact information.

X. FINAL THOUGHTS 

The principal goal of the LISA Science Analysis Work- 
shop is to encourage the development and maturation of 
science analysis technology in preparation for LISA sci- 
ence operations. The principal outcome of the workshop 
will be a report, written by the workshop participants, 
that 

• articulates specific demonstrations of analysis
capabilities that can (and should!) be addressed by
the LISA science analysis community in the next
1-2 years;

• defines the specific data sets needed to make these
demonstrations;

• identifies the support structure (software tools,
community forums and meetings) that will simplify the
completion of these studies; and

• provides a forum for the effective communication 
and dissemination of the results of these studies. 

LISA's best advocates are the scientists whose blood,
toil, tears and sweat will carry out the LISA science
program, from technology through analysis and science
interpretation. If you are not already involved in LISA
science analysis we urge you to become involved by
joining one or both of the TLA and MLDC projects.